F distribution (f)#

The F (Fisher–Snedecor) distribution in SciPy (scipy.stats.f) is a continuous distribution on \((0,\infty)\) that arises as a ratio of two independent scaled chi-square variables.

It is the workhorse behind ANOVA, overall regression significance tests, and classic tests comparing variances.


Learning goals#

  • Understand the chi-square ratio generative story and why the distribution is heavy-tailed.

  • Write down the PDF/CDF (and the Beta-function connection).

  • Derive the mean/variance and understand when moments do not exist.

  • Sample from \(\mathrm{F}(\mathrm{dfn},\mathrm{dfd})\) using a NumPy-only algorithm.

  • Use scipy.stats.f for evaluation, simulation, and fitting.

import numpy as np

import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

import scipy
from scipy import optimize
from scipy import stats
from scipy.special import beta as beta_func
from scipy.special import betainc, betaln, digamma

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

np.set_printoptions(precision=5, suppress=True)
rng = np.random.default_rng(42)

# Record versions for reproducibility (useful when numerical details matter).
VERSIONS = {"numpy": np.__version__, "scipy": scipy.__version__, "plotly": plotly.__version__}
VERSIONS
{'numpy': '1.26.2', 'scipy': '1.15.0', 'plotly': '6.5.2'}

1) Title & Classification#

  • Name: f (F distribution; Fisher–Snedecor; SciPy: scipy.stats.f)

  • Type: Continuous

  • Support (standard form): \(x \in (0,\infty)\)

  • Parameter space (standard form): numerator df \(\mathrm{dfn} > 0\), denominator df \(\mathrm{dfd} > 0\)

  • SciPy location/scale: \(\mathrm{loc} \in \mathbb{R}\) and \(\mathrm{scale} > 0\), with \(X = \mathrm{loc} + \mathrm{scale}\,Y\), where \(Y \sim \mathrm{F}(\mathrm{dfn},\mathrm{dfd})\).

Unless stated otherwise, this notebook works with the standard form (loc=0, scale=1).
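
A quick numerical check of this convention (a minimal sketch; the loc, scale, and degrees-of-freedom values below are arbitrary illustrations):

loc, scale = 1.5, 2.0
y = np.linspace(0.01, 5.0, 7)

shifted = stats.f(5.0, 10.0, loc=loc, scale=scale)
standard = stats.f(5.0, 10.0)

# If X = loc + scale*Y, then pdf_X(x) = pdf_Y((x - loc)/scale) / scale.
x_pts = loc + scale * y
assert np.allclose(shifted.pdf(x_pts), standard.pdf(y) / scale)
assert np.allclose(shifted.cdf(x_pts), standard.cdf(y))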

2) Intuition & Motivation#

What it models#

The F distribution models a ratio of (noise) energy levels.

A standard generative story is:

  1. Draw \(U \sim \chi^2_{\mathrm{dfn}}\) and \(V \sim \chi^2_{\mathrm{dfd}}\) independently.

  2. Normalize each by its degrees of freedom.

  3. Take their ratio:

\[ X = \frac{U/\mathrm{dfn}}{V/\mathrm{dfd}} \;\sim\; \mathrm{F}(\mathrm{dfn},\mathrm{dfd}). \]

If \(U/\mathrm{dfn}\) and \(V/\mathrm{dfd}\) are thought of as variance-like quantities (they are averages of squared standard normals), then \(X\) is a variance ratio.
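
This story is easy to verify by simulation. The sketch below uses NumPy's built-in chi-square sampler and a fresh generator (the degrees of freedom and sample size are arbitrary choices):

# Simulate the ratio-of-scaled-chi-squares story and compare to scipy.stats.f.
rng_demo = np.random.default_rng(0)
d1, d2 = 4.0, 12.0
n_sim = 100_000

u_sim = rng_demo.chisquare(d1, size=n_sim)
v_sim = rng_demo.chisquare(d2, size=n_sim)
x_sim = (u_sim / d1) / (v_sim / d2)

# One-sample KS test against the theoretical F(d1, d2) CDF:
# the statistic should be small and the p-value unremarkable.
stats.kstest(x_sim, stats.f(d1, d2).cdf)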

Typical real-world use cases#

  • ANOVA: compares explained variance to unexplained variance.

  • Regression F-test: tests whether a set of predictors improves fit beyond a baseline model.

  • Comparing variances: under Gaussian assumptions, \(s_1^2/s_2^2\) follows an F law under the null \(\sigma_1^2=\sigma_2^2\).

Relations to other distributions#

  • t distribution: if \(T \sim t_{\nu}\) then \(T^2 \sim \mathrm{F}(1,\nu)\).

  • Reciprocal symmetry: if \(X \sim \mathrm{F}(d_1,d_2)\) then \(1/X \sim \mathrm{F}(d_2,d_1)\).

  • Beta prime / Beta connection: if \(B \sim \mathrm{Beta}(a,b)\) with \(a=d_1/2\), \(b=d_2/2\), then \(X = \frac{d_2}{d_1}\,\frac{B}{1-B} \sim \mathrm{F}(d_1,d_2)\). This identity explains why the CDF involves the regularized incomplete beta function.
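
Each of these identities can be checked numerically; here is a minimal Monte Carlo sketch (the df values and sample sizes are arbitrary):

# KS checks of the t-squared and reciprocal identities.
rng_rel = np.random.default_rng(1)
nu, d1, d2 = 7.0, 3.0, 9.0

t_sq = stats.t(nu).rvs(size=50_000, random_state=rng_rel) ** 2    # T^2 ~ F(1, nu)
x_rel = stats.f(d1, d2).rvs(size=50_000, random_state=rng_rel)    # 1/X ~ F(d2, d1)

(
    stats.kstest(t_sq, stats.f(1, nu).cdf).statistic,
    stats.kstest(1.0 / x_rel, stats.f(d2, d1).cdf).statistic,
)  # both should be close to 0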

3) Formal Definition#

Let \(d_1=\mathrm{dfn}>0\) and \(d_2=\mathrm{dfd}>0\).

PDF#

For \(x>0\) the F distribution has density

\[\begin{split} \begin{aligned} f(x; d_1,d_2) &= \frac{\left(\frac{d_1}{d_2}\right)^{d_1/2}}{\mathrm{B}(d_1/2,\,d_2/2)}\;\frac{x^{\,d_1/2-1}}{\left(1+\frac{d_1}{d_2}x\right)^{(d_1+d_2)/2}}, \qquad x>0, \\ f(x; d_1,d_2) &= 0, \qquad x\le 0, \end{aligned} \end{split}\]

where \(\mathrm{B}(\cdot,\cdot)\) is the Beta function.

CDF#

Define the transformation

\[ u(x) = \frac{d_1 x}{d_1 x + d_2} \in (0,1). \]

Then the CDF is

\[ F(x; d_1,d_2) = \mathrm{I}_{u(x)}\!\left(\frac{d_1}{2},\frac{d_2}{2}\right), \]

where \(\mathrm{I}_z(a,b)\) is the regularized incomplete beta function.

def f_logpdf(x: np.ndarray, dfn: float, dfd: float) -> np.ndarray:
    """Log-PDF of the standard F(dfn, dfd) distribution (loc=0, scale=1)."""

    dfn = float(dfn)
    dfd = float(dfd)
    if dfn <= 0 or dfd <= 0:
        raise ValueError("dfn and dfd must be > 0")

    x = np.asarray(x, dtype=float)
    out = np.full_like(x, -np.inf, dtype=float)

    mask = x > 0
    xm = x[mask]

    a = 0.5 * dfn
    b = 0.5 * dfd
    ratio = dfn / dfd

    # log f(x) = a log(dfn/dfd) - log B(a,b) + (a-1) log x - (a+b) log(1 + (dfn/dfd) x)
    out[mask] = (
        a * np.log(ratio)
        - betaln(a, b)
        + (a - 1.0) * np.log(xm)
        - (a + b) * np.log1p(ratio * xm)
    )
    return out


def f_pdf(x: np.ndarray, dfn: float, dfd: float) -> np.ndarray:
    """PDF of the standard F(dfn, dfd) distribution (loc=0, scale=1)."""
    return np.exp(f_logpdf(x, dfn, dfd))


def f_cdf(x: np.ndarray, dfn: float, dfd: float) -> np.ndarray:
    """CDF of the standard F(dfn, dfd) distribution (loc=0, scale=1)."""

    dfn = float(dfn)
    dfd = float(dfd)
    if dfn <= 0 or dfd <= 0:
        raise ValueError("dfn and dfd must be > 0")

    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x, dtype=float)

    mask = x > 0
    xm = x[mask]

    a = 0.5 * dfn
    b = 0.5 * dfd

    # Stable transform u = dfn*x/(dfn*x+dfd) = 1 / (1 + dfd/(dfn*x))
    u = 1.0 / (1.0 + (dfd / (dfn * xm)))
    out[mask] = betainc(a, b, u)
    return out
# Sanity check: our formulas match SciPy.
dfn, dfd = 5.0, 10.0
x = np.logspace(-3, 2, 25)

dist = stats.f(dfn, dfd)

assert np.allclose(f_pdf(x, dfn, dfd), dist.pdf(x))
assert np.allclose(f_cdf(x, dfn, dfd), dist.cdf(x))
assert np.allclose(f_logpdf(x, dfn, dfd), dist.logpdf(x))

4) Moments & Properties#

A key fact is that the F distribution has polynomial tails, so not all moments exist.

Let \(a=d_1/2\) and \(b=d_2/2\).

Raw moments#

For \(k < b\) (equivalently \(d_2 > 2k\)), the \(k\)-th raw moment exists and is

\[ \mathbb{E}[X^k] = \left(\frac{d_2}{d_1}\right)^k\,\frac{\mathrm{B}(a+k,\,b-k)}{\mathrm{B}(a,b)}. \]

This immediately implies:

  • mean exists iff \(d_2>2\)

  • variance exists iff \(d_2>4\)

  • skewness exists iff \(d_2>6\)

  • (excess) kurtosis exists iff \(d_2>8\)

Mean / variance / skewness / kurtosis#

When the corresponding moments exist:

\[ \mathbb{E}[X] = \frac{d_2}{d_2-2}, \qquad (d_2>2) \]
\[ \mathrm{Var}(X) = \frac{2 d_2^2(d_1+d_2-2)}{d_1(d_2-2)^2(d_2-4)}, \qquad (d_2>4) \]
\[ \gamma_1 = \frac{(2d_1 + d_2 - 2)\sqrt{8(d_2-4)}}{(d_2-6)\sqrt{d_1(d_1+d_2-2)}}, \qquad (d_2>6) \]
\[ \gamma_2 = \frac{12\left[d_1(5d_2-22)(d_1+d_2-2) + (d_2-4)(d_2-2)^2\right]}{d_1(d_2-6)(d_2-8)(d_1+d_2-2)}, \qquad (d_2>8) \]

where \(\gamma_2\) is the excess kurtosis.

Mode#

If \(d_1>2\), the mode is

\[ \operatorname{mode}(X) = \frac{d_2(d_1-2)}{d_1(d_2+2)}. \]
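
The closed form can be checked against a numerical maximizer of the PDF (a sketch; the parameter values are arbitrary, chosen with \(d_1>2\) so the mode exists):

d1, d2 = 6.0, 20.0
mode_closed = d2 * (d1 - 2) / (d1 * (d2 + 2))

# Maximize the PDF by minimizing its negative on a bounded interval.
res_mode = optimize.minimize_scalar(lambda t: -stats.f(d1, d2).pdf(t), bounds=(1e-6, 10.0), method="bounded")

mode_closed, res_mode.x  # should agree up to optimizer tolerance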

MGF / characteristic function#

  • The moment generating function \(M_X(t)=\mathbb{E}[e^{tX}]\) does not exist for any \(t>0\) (the tail is too heavy).

  • The expectation \(\mathbb{E}[e^{tX}]\) is finite for \(t\le 0\); equivalently, the Laplace transform \(\mathbb{E}[e^{-sX}]\) exists for all \(s\ge 0\).

  • The characteristic function \(\varphi_X(t)=\mathbb{E}[e^{itX}]\) exists for all real \(t\) because \(|e^{itX}|\le 1\).

A generic integral representation is

\[ \varphi_X(t) = \int_0^{\infty} e^{itx}\,f(x;d_1,d_2)\,dx. \]

Closed forms involve special functions (hypergeometric functions) and are rarely used directly in practice.
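
For reference, \(\varphi_X(t)\) can be evaluated by direct numerical quadrature, at least for moderate \(|t|\) where the integrand's oscillation is mild (a sketch; the t value is arbitrary):

from scipy import integrate

d1, d2 = 5.0, 10.0
t_val = 0.7
pdf_f = stats.f(d1, d2).pdf

# Real and imaginary parts of E[exp(itX)] as separate 1-D integrals.
re_part, _ = integrate.quad(lambda s: np.cos(t_val * s) * pdf_f(s), 0.0, np.inf)
im_part, _ = integrate.quad(lambda s: np.sin(t_val * s) * pdf_f(s), 0.0, np.inf)

phi = complex(re_part, im_part)
abs(phi)  # must be <= 1 for any characteristic function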

Entropy#

The differential entropy has a closed form:

\[\begin{split} \begin{aligned} h(X) &= -\int_0^{\infty} f(x;d_1,d_2)\,\log f(x;d_1,d_2)\,dx \\ &= \log \mathrm{B}(a,b) - (a-1)\psi(a) - (b+1)\psi(b) + (a+b)\psi(a+b) + \log\left(\frac{d_2}{d_1}\right), \end{aligned} \end{split}\]

where \(\psi\) is the digamma function.

def f_moment_raw(dfn: float, dfd: float, k: float) -> float:
    """Raw moment E[X^k] for X ~ F(dfn, dfd), when it exists (dfd > 2k)."""

    dfn = float(dfn)
    dfd = float(dfd)
    k = float(k)
    if dfn <= 0 or dfd <= 0:
        raise ValueError("dfn and dfd must be > 0")
    if dfd <= 2 * k:
        return np.nan

    a = 0.5 * dfn
    b = 0.5 * dfd
    return (dfd / dfn) ** k * np.exp(betaln(a + k, b - k) - betaln(a, b))


def f_mean(dfn: float, dfd: float) -> float:
    return np.nan if dfd <= 2 else dfd / (dfd - 2)


def f_var(dfn: float, dfd: float) -> float:
    if dfd <= 4:
        return np.nan
    return (2 * dfd**2 * (dfn + dfd - 2)) / (dfn * (dfd - 2) ** 2 * (dfd - 4))


def f_skew(dfn: float, dfd: float) -> float:
    if dfd <= 6:
        return np.nan
    return ((2 * dfn + dfd - 2) * np.sqrt(8 * (dfd - 4))) / ((dfd - 6) * np.sqrt(dfn * (dfn + dfd - 2)))


def f_kurt_excess(dfn: float, dfd: float) -> float:
    if dfd <= 8:
        return np.nan

    num = 12 * (dfn * (5 * dfd - 22) * (dfn + dfd - 2) + (dfd - 4) * (dfd - 2) ** 2)
    den = dfn * (dfd - 6) * (dfd - 8) * (dfn + dfd - 2)
    return num / den


def f_entropy(dfn: float, dfd: float) -> float:
    a = 0.5 * float(dfn)
    b = 0.5 * float(dfd)
    return (
        np.log(beta_func(a, b))
        - (a - 1) * digamma(a)
        - (b + 1) * digamma(b)
        + (a + b) * digamma(a + b)
        + np.log(dfd / dfn)
    )


dfn, dfd = 7.0, 25.0
scipy_m, scipy_v, scipy_s, scipy_k = stats.f(dfn, dfd).stats(moments="mvsk")

out = {
    "mean": (f_mean(dfn, dfd), scipy_m),
    "var": (f_var(dfn, dfd), scipy_v),
    "skew": (f_skew(dfn, dfd), scipy_s),
    "kurt_excess": (f_kurt_excess(dfn, dfd), scipy_k),
    "entropy": (f_entropy(dfn, dfd), stats.f(dfn, dfd).entropy()),
}

out
{'mean': (1.0869565217391304, 1.0869565217391304),
 'var': (0.4822344816943791, 0.4822344816943791),
 'skew': (1.741779266684047, 1.741779266684047),
 'kurt_excess': (5.791950464396285, 5.7919504643962885),
 'entropy': (0.8571957656704647, 0.8571957656704612)}

5) Parameter Interpretation#

The parameters are degrees of freedom:

  • \(d_1=\mathrm{dfn}\) controls the numerator chi-square \(U \sim \chi^2_{d_1}\).

  • \(d_2=\mathrm{dfd}\) controls the denominator chi-square \(V \sim \chi^2_{d_2}\).

How shape changes#

  • Small \(d_2\) means a noisy denominator \(V/d_2\), which creates a very heavy right tail.

    • This is why the mean, variance, skewness, and kurtosis cease to exist once \(d_2\) falls to \(2, 4, 6, 8\) or below, respectively.

  • Larger \(d_1\) pushes the distribution away from 0 (the density behaves like \(x^{d_1/2-1}\) near 0).

Useful limits#

  • As \(d_2 \to \infty\), the denominator concentrates: \(V/d_2 \to 1\), so

    \[\mathrm{F}(d_1,d_2) \Rightarrow \chi^2_{d_1}/d_1.\]
  • As both \(d_1,d_2\) grow, both \(U/d_1\) and \(V/d_2\) concentrate near 1, and the ratio concentrates near 1.

  • Reciprocal symmetry: \(X \sim \mathrm{F}(d_1,d_2)\) implies \(1/X \sim \mathrm{F}(d_2,d_1)\).
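
The first limit is easy to see numerically (a sketch; the grid and df values are illustrative):

# As dfd grows, the F(d1, dfd) CDF approaches the CDF of chi2_{d1}/d1.
d1 = 5.0
x_grid = np.linspace(0.05, 5.0, 200)

for d2_lim in (10.0, 100.0, 10_000.0):
    # P(chi2_{d1}/d1 <= x) = chi2.cdf(d1 * x, d1)
    gap = np.max(np.abs(stats.f(d1, d2_lim).cdf(x_grid) - stats.chi2(d1).cdf(d1 * x_grid)))
    print(f"dfd={d2_lim:>8.0f}: max CDF gap = {gap:.5f}")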

x = np.linspace(0.001, 6.0, 600)

# Vary dfd (tail heaviness)
dfn_fixed = 5
params_tail = [(dfn_fixed, 5), (dfn_fixed, 10), (dfn_fixed, 30)]

fig = go.Figure()
for dfn, dfd in params_tail:
    fig.add_trace(go.Scatter(x=x, y=f_pdf(x, dfn, dfd), mode="lines", name=f"dfn={dfn}, dfd={dfd}"))

fig.update_layout(
    title="F PDF: effect of the denominator df (tail heaviness)",
    xaxis_title="x",
    yaxis_title="pdf(x)",
)
fig.show()

# Vary dfn (left/peak behavior)
dfd_fixed = 10
params_peak = [(2.5, dfd_fixed), (5, dfd_fixed), (20, dfd_fixed)]

fig = go.Figure()
for dfn, dfd in params_peak:
    fig.add_trace(go.Scatter(x=x, y=f_pdf(x, dfn, dfd), mode="lines", name=f"dfn={dfn}, dfd={dfd}"))

fig.update_layout(
    title="F PDF: effect of the numerator df (behavior near 0 and peak)",
    xaxis_title="x",
    yaxis_title="pdf(x)",
)
fig.show()

6) Derivations#

This section sketches three useful derivations: the CDF, the mean/variance, and the likelihood.

Deriving the CDF via a Beta transformation#

Let \(X \sim \mathrm{F}(d_1,d_2)\) and define

\[ U = \frac{d_1 X}{d_1 X + d_2} \in (0,1). \]

Using the Beta-prime identity (or a Jacobian change of variables), one can show

\[ U \sim \mathrm{Beta}\left(\frac{d_1}{2},\frac{d_2}{2}\right). \]

Therefore

\[ \mathbb{P}(X\le x) = \mathbb{P}\left(\frac{d_1 X}{d_1 X + d_2} \le \frac{d_1 x}{d_1 x + d_2}\right) = \mathbb{P}\left(U \le u(x)\right) = \mathrm{I}_{u(x)}\!\left(\frac{d_1}{2},\frac{d_2}{2}\right). \]
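
A Monte Carlo sanity check of the Beta claim (a sketch; the dfs and sample size are arbitrary):

# Transform F samples and test them against Beta(d1/2, d2/2).
rng_beta = np.random.default_rng(2)
d1, d2 = 5.0, 10.0

x_samp = stats.f(d1, d2).rvs(size=50_000, random_state=rng_beta)
u_samp = d1 * x_samp / (d1 * x_samp + d2)

stats.kstest(u_samp, stats.beta(0.5 * d1, 0.5 * d2).cdf)  # statistic should be tiny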

Deriving moments#

From the Beta connection

\[ X = \frac{d_2}{d_1}\,\frac{B}{1-B},\qquad B\sim\mathrm{Beta}(a,b),\ a=d_1/2,\ b=d_2/2. \]

Then for \(k<b\):

\[ \mathbb{E}[X^k] = \left(\frac{d_2}{d_1}\right)^k\,\mathbb{E}\left[B^k(1-B)^{-k}\right] = \left(\frac{d_2}{d_1}\right)^k\,\frac{\mathrm{B}(a+k,b-k)}{\mathrm{B}(a,b)}. \]

Setting \(k=1\) and \(k=2\) yields \(\mathbb{E}[X]\) and \(\mathbb{E}[X^2]\); then

\[ \mathrm{Var}(X)=\mathbb{E}[X^2]-\mathbb{E}[X]^2. \]
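
Using f_moment_raw from Section 4, the variance identity can be confirmed directly:

d1, d2 = 7.0, 25.0
m1 = f_moment_raw(d1, d2, 1)
m2 = f_moment_raw(d1, d2, 2)

m2 - m1**2, f_var(d1, d2)  # the two variance computations agree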

Likelihood (i.i.d. sample)#

Given data \(x_1,\dots,x_n\) i.i.d. from \(\mathrm{F}(d_1,d_2)\) (standard form), the log-likelihood is

\[\begin{split} \begin{aligned} \ell(d_1,d_2) &= \sum_{i=1}^n \log f(x_i;d_1,d_2)\\ &= n\left[\tfrac{d_1}{2}\log\left(\tfrac{d_1}{d_2}\right) - \log \mathrm{B}(d_1/2,d_2/2)\right] + \left(\tfrac{d_1}{2}-1\right)\sum_{i=1}^n \log x_i - \tfrac{d_1+d_2}{2}\sum_{i=1}^n \log\left(1+\tfrac{d_1}{d_2}x_i\right). \end{aligned} \end{split}\]

There is no closed-form MLE for \((d_1,d_2)\); it is typically found by numerical optimization.

def f_loglik(dfn: float, dfd: float, data: np.ndarray) -> float:
    data = np.asarray(data, dtype=float)
    if np.any(data <= 0):
        return -np.inf
    return float(np.sum(f_logpdf(data, dfn, dfd)))


# Demonstration: recover parameters from synthetic data using log-likelihood.
rng_fit = np.random.default_rng(123)
dfn_true, dfd_true = 6.0, 18.0
sample = stats.f(dfn_true, dfd_true).rvs(size=2000, random_state=rng_fit)


def neg_loglik(theta: np.ndarray) -> float:
    # Optimize in log-space to enforce positivity.
    dfn = float(np.exp(theta[0]))
    dfd = float(np.exp(theta[1]))
    return -f_loglik(dfn, dfd, sample)


theta0 = np.log([5.0, 10.0])
res = optimize.minimize(neg_loglik, theta0, method="Nelder-Mead")

dfn_hat, dfd_hat = np.exp(res.x)

# Compare to SciPy's built-in MLE (fix loc=0, scale=1).
dfn_fit, dfd_fit, loc_fit, scale_fit = stats.f.fit(sample, floc=0, fscale=1)

{
    "true": (dfn_true, dfd_true),
    "mle_manual": (dfn_hat, dfd_hat),
    "mle_scipy_fit": (dfn_fit, dfd_fit),
    "success": res.success,
}
{'true': (6.0, 18.0),
 'mle_manual': (6.26112794339186, 16.819704450943895),
 'mle_scipy_fit': (6.261003935389471, 16.820016818351448),
 'success': True}

7) Sampling & Simulation#

The chi-square ratio representation gives a simple sampling recipe:

  1. Sample \(U \sim \chi^2_{d_1}\) and \(V \sim \chi^2_{d_2}\) independently.

  2. Return \(X = (U/d_1)/(V/d_2)\).

A chi-square random variable is a Gamma:

\[ \chi^2_{\nu} \equiv \mathrm{Gamma}\left(k=\nu/2,\;\text{scale}=2\right). \]
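
This identity is easy to confirm against SciPy (a one-line sketch; \(\nu\) is arbitrary):

nu = 7.0
x_chk = np.linspace(0.1, 20.0, 50)
assert np.allclose(stats.chi2(nu).pdf(x_chk), stats.gamma(nu / 2.0, scale=2.0).pdf(x_chk))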

So sampling from F reduces to sampling from a Gamma distribution. Below we implement a NumPy-only Gamma sampler using the Marsaglia–Tsang method.

Marsaglia–Tsang (high level)#

For \(k\ge 1\) (shape), let \(d = k - \tfrac{1}{3}\) and \(c = 1/\sqrt{9d}\):

  • Draw \(Z\sim\mathcal{N}(0,1)\) and set \(V=(1+cZ)^3\), rejecting \(V\le 0\).

  • Accept \(dV\) as the Gamma draw using a cheap polynomial squeeze test first, falling back to an exact log test.

For \(k<1\), sample at shape \(k+1\) and multiply by \(U^{1/k}\) with \(U\sim\mathrm{Unif}(0,1)\) (a power transform).

This is a standard, fast algorithm used in many libraries.

def gamma_rvs_numpy(shape: float, size: int, rng: np.random.Generator, scale: float = 1.0) -> np.ndarray:
    """Sample Gamma(shape, scale) using NumPy only (Marsaglia–Tsang).

    Parameters
    ----------
    shape:
        k > 0
    size:
        number of samples
    rng:
        NumPy Generator
    scale:
        theta > 0 (multiplicative scale)
    """

    k = float(shape)
    theta = float(scale)
    if k <= 0:
        raise ValueError("shape must be > 0")
    if theta <= 0:
        raise ValueError("scale must be > 0")

    # k < 1: boost to k+1 and apply power transform
    if k < 1:
        g = gamma_rvs_numpy(k + 1.0, size, rng, scale=1.0)
        u = rng.random(size)
        return theta * g * (u ** (1.0 / k))

    # k >= 1: Marsaglia–Tsang
    d = k - 1.0 / 3.0
    c = 1.0 / np.sqrt(9.0 * d)

    out = np.empty(size, dtype=float)
    filled = 0

    while filled < size:
        n = size - filled
        x = rng.standard_normal(n)
        v = 1.0 + c * x
        v = v * v * v  # (1 + c x)^3
        u = rng.random(n)

        positive = v > 0

        # First (cheap) acceptance
        accept = positive & (u < 1.0 - 0.0331 * (x**4))

        # Second acceptance (log test)
        log_v = np.zeros_like(v)
        log_v[positive] = np.log(v[positive])

        accept2 = positive & (~accept) & (np.log(u) < 0.5 * x * x + d * (1.0 - v + log_v))

        accept = accept | accept2
        accepted = d * v[accept]

        take = min(accepted.size, n)
        out[filled : filled + take] = accepted[:take]
        filled += take

    return theta * out


def chisquare_rvs_numpy(df: float, size: int, rng: np.random.Generator) -> np.ndarray:
    """Sample ChiSquare(df) using Gamma(df/2, scale=2)."""
    df = float(df)
    if df <= 0:
        raise ValueError("df must be > 0")
    return gamma_rvs_numpy(df / 2.0, size, rng, scale=2.0)


def f_rvs_numpy(dfn: float, dfd: float, size: int, rng: np.random.Generator) -> np.ndarray:
    """Sample F(dfn, dfd) via ratio of independent chi-squares."""

    u = chisquare_rvs_numpy(dfn, size, rng)
    v = chisquare_rvs_numpy(dfd, size, rng)
    return (u / dfn) / (v / dfd)


# Monte Carlo validation against theory
n = 200_000
params = (5.0, 10.0)
samples_numpy = f_rvs_numpy(*params, n, rng)

m_theory, v_theory = stats.f(*params).stats(moments="mv")
np.mean(samples_numpy), np.var(samples_numpy), m_theory, v_theory
(1.2529035200365988, 1.3745254004774181, 1.25, 1.3541666666666667)

8) Visualization#

We will visualize:

  • the PDF for several parameter settings

  • the CDF and an empirical CDF from Monte Carlo samples

  • a histogram of simulated samples overlaid with the theoretical PDF

dfn, dfd = 5.0, 10.0
x = np.linspace(0.001, 6.0, 600)

# PDF line + Monte Carlo histogram
n_vis = 60_000
samples = f_rvs_numpy(dfn, dfd, n_vis, rng)

fig = go.Figure()
fig.add_trace(
    go.Histogram(
        x=samples,
        histnorm="probability density",
        nbinsx=120,
        name="Monte Carlo",
        opacity=0.55,
    )
)
fig.add_trace(go.Scatter(x=x, y=f_pdf(x, dfn, dfd), mode="lines", name="PDF (theory)", line=dict(width=3)))

fig.update_layout(
    title=f"F(dfn={dfn}, dfd={dfd}): histogram vs PDF",
    xaxis_title="x",
    yaxis_title="density",
    bargap=0.02,
)
fig.show()

# CDF vs empirical CDF
xs = np.sort(samples)
ecdf = np.arange(1, xs.size + 1) / xs.size

fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=f_cdf(x, dfn, dfd), mode="lines", name="CDF (theory)", line=dict(width=3)))
fig.add_trace(go.Scatter(x=xs, y=ecdf, mode="lines", name="CDF (empirical)", line=dict(width=2, dash="dot")))

fig.update_layout(
    title=f"F(dfn={dfn}, dfd={dfd}): empirical CDF vs theory",
    xaxis_title="x",
    yaxis_title="CDF",
)
fig.show()

9) SciPy Integration (scipy.stats.f)#

SciPy provides a robust implementation with location/scale support.

from scipy import stats

dist = stats.f(dfn, dfd)          # standard form
pdf_vals = dist.pdf(x)
cdf_vals = dist.cdf(x)
rv = dist.rvs(size=1000, random_state=rng)

# Fit (MLE). By default SciPy fits dfn, dfd, loc, scale.
# If you want the *standard* F family, fix loc=0 and scale=1.
dfn_hat, dfd_hat, loc_hat, scale_hat = stats.f.fit(rv, floc=0, fscale=1)  # fit the sample drawn above

Below is a short, end-to-end example.

dfn, dfd = 4.0, 12.0
x0 = np.array([0.25, 0.5, 1.0, 2.0, 4.0])

dist = stats.f(dfn, dfd)

pdf = dist.pdf(x0)
cdf = dist.cdf(x0)

rv = dist.rvs(size=5, random_state=rng)

# Fitting: generate data, then recover parameters.
rng_fit = np.random.default_rng(7)
data = stats.f(dfn, dfd).rvs(size=4000, random_state=rng_fit)

dfn_hat, dfd_hat, loc_hat, scale_hat = stats.f.fit(data, floc=0, fscale=1)

{
    "x": x0,
    "pdf": pdf,
    "cdf": cdf,
    "rvs": rv,
    "fit": {"dfn_hat": dfn_hat, "dfd_hat": dfd_hat, "loc": loc_hat, "scale": scale_hat},
}
{'x': array([0.25, 0.5 , 1.  , 2.  , 4.  ]),
 'pdf': array([0.61496, 0.67983, 0.46719, 0.15676, 0.02124]),
 'cdf': array([0.09586, 0.26351, 0.55505, 0.84137, 0.97256]),
 'rvs': array([1.41411, 1.15258, 0.53981, 1.69128, 0.04488]),
 'fit': {'dfn_hat': 4.006050540012662,
  'dfd_hat': 11.86095144322162,
  'loc': 0,
  'scale': 1}}

10) Statistical Use Cases#

A) Hypothesis testing#

  1. Variance ratio test (Gaussian assumption)

If \(X_1,\dots,X_{n_1}\) and \(Y_1,\dots,Y_{n_2}\) are independent normal samples with equal variance under \(H_0\), then

\[ \frac{S_X^2}{S_Y^2} \sim \mathrm{F}(n_1-1,\,n_2-1) \]

where \(S^2\) denotes the sample variance.

  2. ANOVA / regression

F statistics often take the form

\[ F = \frac{\text{signal per parameter}}{\text{noise per degree of freedom}}, \]

and compare explained vs unexplained variability.
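
As a concrete illustration, here is a minimal overall F-test for a linear regression on synthetic data (a sketch; the sample size, predictor count, and coefficients are arbitrary choices):

# Overall regression F-test: do the p slopes jointly improve on an intercept-only model?
rng_reg = np.random.default_rng(3)
n_obs, p = 80, 3

X_mat = rng_reg.standard_normal((n_obs, p))
beta_true = np.array([0.5, 0.0, -0.3])
y_obs = 1.0 + X_mat @ beta_true + rng_reg.standard_normal(n_obs)

# OLS with an intercept column.
X_design = np.column_stack([np.ones(n_obs), X_mat])
coef, *_ = np.linalg.lstsq(X_design, y_obs, rcond=None)
resid = y_obs - X_design @ coef

ss_res = float(np.sum(resid**2))
ss_tot = float(np.sum((y_obs - y_obs.mean()) ** 2))

# Under H0 (all slopes zero), F ~ F(p, n_obs - p - 1).
F_reg = ((ss_tot - ss_res) / p) / (ss_res / (n_obs - p - 1))
p_val = stats.f.sf(F_reg, p, n_obs - p - 1)

F_reg, p_val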

B) Bayesian modeling (variance ratios)#

For a normal sample with Jeffreys prior \(p(\mu,\sigma^2)\propto 1/\sigma^2\), the posterior for \(\sigma^2\) is a scaled inverse chi-square. For two independent groups, the posterior for the variance ratio satisfies

\[ \frac{\sigma_1^2}{\sigma_2^2}\,\bigg|\,\text{data} \;\sim\; \frac{s_1^2}{s_2^2}\;\mathrm{F}(\nu_2,\nu_1), \qquad \nu_i=n_i-1. \]

This yields a closed-form credible interval for \(\sigma_1^2/\sigma_2^2\).

C) Generative modeling#

The F distribution is a flexible heavy-tailed positive distribution. One trick is to use it as a random scale:

\[ S \sim \mathrm{F}(d_1,d_2),\qquad Y = \sqrt{S}\,\varepsilon,\ \varepsilon\sim\mathcal{N}(0,1). \]

This produces occasional large scales (outliers) while remaining easy to simulate.

# A) Variance ratio test example
rng_test = np.random.default_rng(0)

n1, n2 = 25, 18
sigma = 2.0
x = rng_test.normal(0.0, sigma, size=n1)
y = rng_test.normal(0.0, sigma, size=n2)

s1 = np.var(x, ddof=1)
s2 = np.var(y, ddof=1)
F_stat = s1 / s2

# Two-sided p-value
p_left = stats.f.cdf(F_stat, n1 - 1, n2 - 1)
p_right = stats.f.sf(F_stat, n1 - 1, n2 - 1)
p_two_sided = 2 * min(p_left, p_right)

# B) Bayesian variance ratio posterior (Jeffreys prior)
nu1, nu2 = n1 - 1, n2 - 1
scale_ratio = s1 / s2

# Posterior: (sigma1^2/sigma2^2) | data ~ scale_ratio * F(nu2, nu1)
ci_level = 0.95
alpha = 0.5 * (1 - ci_level)
q_lo = stats.f.ppf(alpha, nu2, nu1)
q_hi = stats.f.ppf(1 - alpha, nu2, nu1)
ci_post = (scale_ratio * q_lo, scale_ratio * q_hi)

# Monte Carlo check: sample sigma_i^2 ~ Inv-chi-square(nu_i, s_i^2)
mc = 80_000
sigma1_sq = (nu1 * s1) / chisquare_rvs_numpy(nu1, mc, rng_test)
sigma2_sq = (nu2 * s2) / chisquare_rvs_numpy(nu2, mc, rng_test)
ratio_mc = sigma1_sq / sigma2_sq
ci_mc = (np.quantile(ratio_mc, alpha), np.quantile(ratio_mc, 1 - alpha))

# C) Generative modeling: heavy-tailed scale mixture
m = 120_000
s = f_rvs_numpy(5.0, 5.0, m, rng_test)
eps = rng_test.standard_normal(m)
y_heavy = np.sqrt(s) * eps
y_normal = eps

# Compare tail probabilities
thr = 3.0
p_tail_heavy = np.mean(np.abs(y_heavy) > thr)
p_tail_normal = np.mean(np.abs(y_normal) > thr)

{
    "variance_ratio_test": {"F": F_stat, "p_two_sided": p_two_sided},
    "posterior_CI_closed_form": ci_post,
    "posterior_CI_monte_carlo": ci_mc,
    "tail_P(|Y|>3)": {"heavy": p_tail_heavy, "normal": p_tail_normal},
}
{'variance_ratio_test': {'F': 1.06647032697333,
  'p_two_sided': 0.9077897055189864},
 'posterior_CI_closed_form': (0.4166185566400728, 2.545110186554938),
 'posterior_CI_monte_carlo': (0.418158015120004, 2.5405681465872147),
 'tail_P(|Y|>3)': {'heavy': 0.033525, 'normal': 0.0026333333333333334}}

11) Pitfalls#

  • Invalid parameters: \(d_1\le 0\) or \(d_2\le 0\) is invalid.

  • Non-existent moments: if \(d_2\le 2\), the mean is infinite; if \(d_2\le 4\), the variance is infinite; etc.

  • Numerical stability: for extreme \(x\) or degrees of freedom, prefer logpdf over pdf (see the sketch after this list).

  • loc/scale confusion in SciPy: scipy.stats.f is a four-parameter family by default; fix loc=0, scale=1 if you mean the standard F distribution.

  • Fitting: MLE can be sensitive to outliers (heavy tail) and may converge slowly; check diagnostics and consider robust alternatives when appropriate.
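
To see the logpdf point concretely (a sketch; the evaluation point is an arbitrary extreme value):

# Far in the right tail the density underflows in double precision,
# but its logarithm stays finite and usable.
d1, d2 = 5.0, 10.0
x_far = 1e60

stats.f(d1, d2).pdf(x_far), stats.f(d1, d2).logpdf(x_far)  # (0.0, finite negative number)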

12) Summary#

  • The F distribution models a ratio of two variance-like quantities: \((\chi^2_{d_1}/d_1)/(\chi^2_{d_2}/d_2)\).

  • The PDF/CDF have closed forms involving the Beta function and the regularized incomplete beta.

  • Moments exist only up to order \(k<d_2/2\); in particular, the mean requires \(d_2>2\) and the variance requires \(d_2>4\).

  • Sampling is straightforward via Gamma/chi-square sampling; a NumPy-only Marsaglia–Tsang Gamma sampler works well.

  • In practice, scipy.stats.f provides evaluation, simulation, and MLE fitting.

References#

  • Marsaglia, G. & Tsang, W. W. (2000). A Simple Method for Generating Gamma Variables.

  • Johnson, Kotz, and Balakrishnan. Continuous Univariate Distributions, Volume 2 (2nd ed.), Wiley, 1995.

  • SciPy documentation: scipy.stats.f, scipy.special.betainc.